ABSTRACT

The statistical properties of a map of the primary fluctuations in the cosmic microwave back- 
ground (CMB) may be specified to high accuracy by a few thousand power spectra measure- 
ments, provided the fluctuations are gaussian, yet the number of parameters relevant for the 
CMB is probably no more than about 10-20. There is consequently a large degree of
redundancy in the power spectrum data. In this paper, we show that the MOPED data compression
technique can reduce the CMB power spectrum measurements to about 10-20 numbers (one 
for each parameter), from which the cosmological parameters can be estimated virtually as ac- 
curately as from the complete power spectrum. Combined with recent advances in the speed 
of generation of theoretical power spectra, this offers opportunities for very fast parameter 
estimation from real and simulated CMB skies. The evaluation of the likelihood itself, at 
Planck resolution, is sped up by factors up to $\sim 10^8$, ensuring that this step will not be the
dominant part of the data analysis pipeline.

1 INTRODUCTION

It has been recognised for roughly a decade that detailed study 
of the power spectrum of the fluctuations in the CMB could be 
used to obtain high precision values for several of the cosmological
parameters, such as $\Omega_0$, $H_0$ and $\Omega_\Lambda$ (Bond & Efstathiou 1987;
Kamionkowski, Spergel & Sugiyama 1994; Jungman et al. 1996).
The physics of the CMB is much more straightforward than the 
complicated processes which affect the large-scale structure of the 
Universe, making it a much more promising laboratory for accu- 
rate parameter estimation. The main complications are the pres- 
ence of foreground sources at microwave frequencies and proper 
accounting of instrumental noise effects, but recent balloon
experiments, Boomerang (de Bernardis et al. 2000), MAXIMA
(Hanany et al. 2000) and DASI (Pryke et al. 2001), have demonstrated
that the main scientific goal is achievable with current technology.
As experiments become more ambitious, the data processing
requirements become more demanding, and the current datasets
have sufficiently many pixels ($\sim 10^4$-$10^5$) that the data
processing is already quite challenging. Even the first measurement
of the CMB fluctuations, produced by the Cosmic Background
Explorer (COBE) satellite (Smoot et al. 1992), yielded a dataset with
enough pixels ($\sim 4000$) for data compression techniques to be
valuable (Gorski 1994; Gorski et al. 1994; Bond 1995; Bunn
& Sugiyama 1995). For the satellite experiments MAP (the Microwave
Anisotropy Probe) and Planck (the Planck Surveyor Satellite),
data compression will be vital. Each will provide very large
datasets, with close to all-sky coverage with a resolution of up
to 5 arcminutes, and $\sim 10^6$-$10^7$ pixels. The standard radical
compression method is to reduce the map to a set of power spectrum
estimates (see e.g. Bond, Jaffe & Knox 1998). In principle
this compression can be lossless, if the map is a gaussian random
field (as closely predicted by inflation: see e.g. Gangui et al. 1994; 
Verde et al. 2000; Wang & Kamionkowski 2000), as all the statisti- 
cal properties of the map are calculable from the power spectrum. 
The power spectrum data, typically a few thousand numbers for a 
high-resolution experiment, can then be used to estimate cosmo- 
logical parameters to an accuracy of a few percent. The steps in 
the distillation of the raw data to the cosmological parameters are, 
however, not necessarily straightforward computationally (see e.g. 
Wright 1996; Muciaccia, Natoli & Vittorio 1997; Tegmark 1997a; 
Tegmark 1997b; Bond et al. 1999; Olive, Spergel & Hinshaw 1999; 
Borrill 1999; Wandelt, Hivon & Gorski 2000; Szapudi et al. 2001; 
Natoli et al. 2001; Hivon et al. 2001; Christensen et al. 2001).
This paper addresses one aspect of this problem: parameter estimation
from the power spectrum. MOPED* is an eigenvector-based
method for data compression and parameter estimation, originally
developed for computing star-formation histories from galaxy spectra
(see Heavens, Jimenez & Lahav 2000, hereafter HJL; Reichardt,
Jimenez & Heavens 2001). It can also be employed for very accurate,
and extremely fast, parameter estimation from the CMB.
The speed-up over the brute-force maximum likelihood method is dependent
on the experiment: typical speed-up factors expected for
MAP and Planck are between lO'^ and 10^. MOPED is much more 
powerful than necessary, in fact, as parameter estimation will be 
dominated by the time it takes to run predictions for cosmological 
models, or other steps in the analysis pipeline. 

* MOPED (Multiple Optimised Parameter Estimation and Data Compression)
has patent protection.

The method is based on a technique developed by HJL for
compressing and analysing galaxy spectra. In that paper, it was
shown that datasets with certain noise properties offered possibili- 
ties for very radical linear compression of the data without any loss 
of information about the parameters which determine the data. The 
requirement is for a dataset whose mean depends on the parameters,
but the covariance of the noise does not. In these circumstances, it 
is possible to find a set of linear combinations of the data which are 
locally sufficient statistics for the parameters - i.e. the compressed 
data contain as much information about the parameters as the full 
dataset, and in this sense the compression is lossless (strictly, the 
Fisher matrix is unchanged, so the likelihood surface is known to 
be unchanged only locally near the peak). The compressed dataset 
can be extremely small - it consists of a single number for each 
parameter. Thus for highly redundant datasets, the degree of com- 
pression can be very large. 

It is important to recognise that the data compression can still 
be done even if the assumptions for lossless compression do not
apply. The main assumptions are that the information is contained 
in the mean of the data, not in their variance, and that the fiducial 
model is correct. Violating either of these is not serious for CMB
power spectrum analysis. In HJL, for example, the data compres- 
sion algorithm was applied to the case of galaxy spectra, where the 
noise includes a photon counting noise term which is dependent 
on the mean number of photons in the spectral channel, and hence 
does depend on the parameters of the galaxy. The compressed data 
can still be used for parameter estimation, but the error bars on 
the derived parameters are fractionally larger than by using the full 
spectrum. The same situation arises in the CMB: under general as- 
sumptions, the cosmic variance on a power measurement is propor- 
tional to the square of the power itself, and therefore is dependent 
on the underlying parameters. The data compression, although not 
lossless, is still highly efficient: conditional errors should increase 
by only a fraction of order $1/\ell_{\rm max}$ for an experiment measuring multipoles up
to $\ell_{\rm max}$. The time required for a brute-force likelihood evaluation
is broadly comparable to the time it takes to compute theoretically 
the power spectrum of a model, using CMBFAST (Seljak & Zaldarriaga
1996). Significantly, this part of the process has been accelerated
recently by a factor of $\sim 10^3$ (Tegmark, Zaldarriaga & Hamilton
2001), making it much faster to compute the theoretical power
spectra than to compute a brute-force likelihood. The
relative timings for these two steps can determine the analysis strategy,
since if the computation of the theoretical power spectrum is
fast in comparison with the likelihood evaluation, one can calculate
the power spectrum 'on the fly' as one searches through parameter
space. A useful goal is therefore to make the likelihood
evaluation much quicker than computation of the theoretical power 
spectrum. One can already speed up this process by using variants 
of the Newton-Raphson method (see, e.g. Bond et al. 1999), and 
one can argue that the power of MOPED is not strictly necessary 
for this problem. However, calculations of theoretical
power spectra may be accelerated still further, and this paper
shows that, with MOPED, the analysis need never be dominated by
likelihood evaluations.

In this paper, we demonstrate that MOPED successfully
recovers cosmological parameters from simulated datasets,
many orders of magnitude more quickly than the brute-force likelihood
method. We also show that the parameter errors are similar to those of
the full maximum likelihood solution.



2 MASSIVE LOSSLESS DATA COMPRESSION 

The method is detailed in HJL, so we only sketch details here. We 
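define the data vector $\mathbf{x}$ as the estimates of the power spectrum
$\{C_\ell\}$, where $\ell$ is the angular multipole, in terms of signal $C_\ell$ and
noise $n_\ell$:

$$x_\ell = C_\ell(\theta_\alpha) + n_\ell \qquad (1)$$

where $\theta_\alpha$ are the set of cosmological parameters on which the CMB
power spectrum depends. The noise is assumed to have zero mean,
so

$$\langle x_\ell \rangle = C_\ell(\theta_\alpha) \qquad (2)$$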

and the noise covariance matrix, including cosmic variance and instrument
noise, is $N_{\ell\ell'} = \langle n_\ell n_{\ell'} \rangle$. Angle brackets indicate ensemble
averages; these are calculable analytically for some algorithms
of power spectrum estimation (see e.g. Tegmark 1997b), but for
others, e.g. based on correlation functions (Szapudi et al. 2001), a
Monte Carlo approach is required. In practice this should be the
covariance of the estimates of the power spectrum. Since this is dependent
on the algorithm used to estimate the power spectrum, we
assume for illustration only cosmic variance, modelled as gaussians
with variance $2C_\ell^2/(2\ell+1)$, but in addition we do correlate the power
spectrum estimates to mimic partial sky coverage. This approximation
may not be good, especially for low multipoles. Bond, Jaffe &
Knox (2000) have argued that the distribution may be closer to an
offset lognormal, in which case one can transform the power spectrum
estimates to quantities which have nearly gaussian marginal
distributions. The calculation we show is illustrative, but Planck
will be cosmic variance limited up to high $\ell$.

The brute force maximum likelihood method, which uses all
the power spectrum data points, is the method of estimation which
for a large dataset will provide the smallest errors, assuming uniform
priors. The likelihood of the parameters, given the $N$ power spectrum
estimates with covariance matrix $\mathbf{N}$ (components $N_{\ell\ell'}$), is

$$\mathcal{L}(\theta_\alpha) = \frac{1}{(2\pi)^{N/2} \sqrt{\det \mathbf{N}}} \exp\left[ -\frac{1}{2} \sum_{\ell,\ell'} (x_\ell - C_\ell)\, N^{-1}_{\ell\ell'}\, (x_{\ell'} - C_{\ell'}) \right] \qquad (3)$$

The difficulty is that at each point in parameter space one generally
computes the determinant of, and inverts, an $N \times N$ matrix.
Since this scales as $N^3$, it becomes a significant computational expense,
even with $N \sim 2000$. In this context, significant means that
it significantly exceeds the time currently taken to generate the theoretical
power spectrum.
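
As an illustration of the cost discussed above, the following is a minimal sketch of the brute-force gaussian log-likelihood of equation (3), written in Python with NumPy. The function name `brute_force_loglike` and the toy inputs are assumptions made for this example and are not part of the original analysis; the Cholesky factorisation is the step that scales as $N^3$.

```python
import numpy as np


def brute_force_loglike(x, mean, cov):
    """Log of a multivariate gaussian likelihood with an N x N covariance.

    x    : (N,) measured power spectrum estimates
    mean : (N,) model power spectrum at the trial parameters
    cov  : (N, N) noise covariance matrix (symmetric, positive definite)
    """
    resid = x - mean
    chol = np.linalg.cholesky(cov)        # cov = chol @ chol.T, costs O(N^3)
    white = np.linalg.solve(chol, resid)  # white @ white = resid^T cov^-1 resid
    log_det = 2.0 * np.sum(np.log(np.diag(chol)))
    return -0.5 * (white @ white + log_det + x.size * np.log(2.0 * np.pi))


# Toy usage with hypothetical numbers (not CMB data).
rng = np.random.default_rng(1)
a = rng.normal(size=(5, 5))
cov = a @ a.T + 5.0 * np.eye(5)           # a positive definite test matrix
mean = np.arange(5.0)
x = mean + rng.normal(size=5)
print(brute_force_loglike(x, mean, cov))
```

If the covariance is taken to depend on the parameters, the factorisation has to be repeated at every trial point, which is why this step dominates for large $N$.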

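We can speed up the likelihood evaluation by using MOPED
to compress the $N$ data in the measured $C_\ell$ to one datum for each
of $M$ unknown parameters. The algorithm is detailed in HJL; it
produces a set of weighting vectors $\mathbf{b}_\alpha$ ($\alpha = 1 \ldots M$), from which
a set of MOPED components

$$y_\alpha \equiv \mathbf{b}_\alpha \cdot \mathbf{x} \qquad (4)$$

is constructed. The MOPED vectors are designed to make the
Fisher information matrix

$$F_{\alpha\beta} \equiv -\left\langle \frac{\partial^2 \ln \mathcal{L}}{\partial\theta_\alpha\, \partial\theta_\beta} \right\rangle \qquad (5)$$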

the same whether we use the compressed data $y_\alpha$ or the full set
of power spectrum estimates. In fact this is only possible if we ignore
the dependence of cosmic variance on the parameters, but this
restriction makes virtually no difference for a CMB dataset.

Figure 1. Optimised MOPED weighting vectors for a fiducial model with
$H_0 = 65$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_{\rm CDM} = 0.254$ and $\Omega_\Lambda = 0.7$. The parameter
ordering (see text) is $\Omega_\Lambda$, $H_0$ and $\Omega_{\rm CDM}$, and the MOPED vectors $\mathbf{b}_1$,
$\mathbf{b}_2$, $\mathbf{b}_3$ are shown by the solid, dashed and dotted lines respectively. Derivatives
of the power spectrum have been calculated using finite differences,
which can cause the small glitches seen in this figure.

The MOPED vectors satisfy the following (HJL equation 14):



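$$b_{\ell 1} = \frac{N^{-1}_{\ell\ell'}\, C_{\ell',1}}{\sqrt{C_{\ell'',1}\, N^{-1}_{\ell''\ell'''}\, C_{\ell''',1}}} \qquad (6)$$

$$b_{\ell\alpha} = \frac{N^{-1}_{\ell\ell'}\, C_{\ell',\alpha} - \sum_{\beta=1}^{\alpha-1} \left( C_{\ell',\alpha}\, b_{\ell'\beta} \right) b_{\ell\beta}}{\sqrt{C_{\ell',\alpha}\, N^{-1}_{\ell'\ell''}\, C_{\ell'',\alpha} - \sum_{\beta=1}^{\alpha-1} \left( C_{\ell',\alpha}\, b_{\ell'\beta} \right)^2}} \quad (\alpha > 1) \qquad (7)$$

where a comma denotes a partial derivative with respect to a parameter,
$C_{\ell,\alpha} \equiv \partial C_\ell / \partial\theta_\alpha$, and the summation convention is assumed
for repeated multipole indices. $b_{\ell\alpha}$ refers to the $\ell$ component
of the vector labelled by $\alpha$. Obvious modifications are made
if the data do not include all $\ell$ values - the vector components refer
to the list of modes considered. Note that the MOPED vectors
depend on the order in which the parameters are listed: $y_1$ contains
as much information about parameter 1 as possible. This vector
also constrains parameter 2 to some extent; $y_2$ adds as much additional
information as possible about parameter 2, etc. A set of 3
MOPED vectors is illustrated in Fig. 1, corresponding to vacuum
energy density, Hubble constant and cold dark matter (CDM) density.
These vectors would ensure, under certain assumptions, that
the MOPED components $y_\alpha$ are uncorrelated, and of unit variance;
if this is the case, the likelihood with these as the data is simply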



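$$\mathcal{L}(\theta_\alpha) = \frac{1}{(2\pi)^{3/2}} \exp\left[ -\frac{1}{2} \sum_{i=1}^{3} \left( y_i - \langle y_i \rangle \right)^2 \right] \qquad (8)$$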



where the $\langle y_i \rangle$ are computed from the noise-free (but smoothed)
theoretical power spectra. Importantly, the MOPED vectors ensure that the Fisher
matrix for the compressed dataset $\{y_\alpha\}$ is the same as for the entire
set of power spectrum estimates. The marginal error on a single
parameter is $[(F^{-1})_{\alpha\alpha}]^{1/2}$, and the error on the parameter estimated
using any method cannot be smaller than this (see e.g. Kendall &
Stuart 1969; Tegmark, Taylor & Heavens 1997). Thus, by ensuring
that the Fisher matrices coincide, the compression method can be
described as locally lossless - the parameter errors, as estimated
from the local curvature of the likelihood surface at the peak, are
on average no larger for the compressed data than for the full set of
power spectrum estimates.

Figure 2. Simulated realisation of the CMB power used in the analysis.
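
The recursion of equations (6) and (7) can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `moped_vectors` and the random test inputs are hypothetical, and the covariance is treated as independent of the parameters, which is the condition for locally lossless compression. The closing lines check numerically that the compressed data have unit covariance and that the Fisher matrix is unchanged.

```python
import numpy as np


def moped_vectors(dmean, cov):
    """Build MOPED weighting vectors by a Gram-Schmidt-like recursion.

    dmean : (M, N) derivatives of the mean data with respect to each parameter,
            evaluated at the fiducial model, in the chosen parameter order
    cov   : (N, N) data covariance matrix at the fiducial model
    Returns an (M, N) array whose rows are the vectors b_1 ... b_M.
    """
    cinv_d = np.linalg.solve(cov, dmean.T).T   # rows: cov^-1 applied to each derivative
    b = np.zeros(dmean.shape)
    for a in range(dmean.shape[0]):
        overlap = b[:a] @ dmean[a]             # derivative a dotted with earlier vectors
        numer = cinv_d[a] - overlap @ b[:a]
        denom = np.sqrt(dmean[a] @ cinv_d[a] - overlap @ overlap)
        b[a] = numer / denom
    return b


# Toy check with random (hypothetical) inputs: N = 40 data, M = 3 parameters.
rng = np.random.default_rng(0)
root = rng.normal(size=(40, 40))
cov = root @ root.T + 40.0 * np.eye(40)
dmean = rng.normal(size=(3, 40))
b = moped_vectors(dmean, cov)

y_cov = b @ cov @ b.T                          # covariance of the compressed data
fisher_full = dmean @ np.linalg.solve(cov, dmean.T)
g = b @ dmean.T                                # response of each y to each parameter
fisher_moped = g.T @ g                         # valid because y has unit covariance
marginal_errors = np.sqrt(np.diag(np.linalg.inv(fisher_moped)))
```

Reordering the rows of `dmean` changes the individual vectors, as noted in the text, but not the Fisher matrix of the full compressed set.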

In detail, the assumptions required for locally lossless com- 
pression do not hold for this analysis. In order to calculate the 
MOPED vectors, the data covariance matrix, and the derivatives 
of the power spectrum with respect to the parameters, need to be 
known. These are fixed by assuming a fiducial set of parameters. 
We show below that this fiducial set is not important, but one can 
iterate the process if desired, at minimal extra computational ex- 
pense. Our results show that iteration is actually unnecessary. The 
second assumption is that the covariance matrix of the data is not 
dependent on the model parameters. This is not strictly true for 
the CMB power spectrum, as the noise includes a cosmic variance 
term which is dependent on the cosmology. However, this does not 
prevent us from compressing the data and, in fact, the Fisher matrix is
dominated by the sensitivity of the power spectrum itself to the pa- 
rameters, rather than the sensitivity of the noise. 

A few remarks on speed are in order. With $N$ power spectrum
estimates, brute-force likelihood calculations require $O(N^3)$ calculations.
With $M$ parameters, MOPED requires $O(M)$ operations
per likelihood evaluation. In addition, there are $O(MN)$ operations
to compute the $\langle y_i \rangle$ quantities, but these can be done in advance if
a library of theoretical power spectra is built up prior to analysis of
the data. This point is potentially important for Planck; libraries of
theoretical power spectra (and $\langle y_i \rangle$) can be constructed in the years
before launch; if so, the parameter estimation step can be a very
fast process, utilising interpolation of the $\langle y_i \rangle$ if desired.
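
The division of labour described above (compress a library of models once, then evaluate each likelihood from $M$ numbers) might look as follows in Python with NumPy. The function names `compress_library` and `moped_loglikes`, and the random stand-ins for the weighting vectors and theoretical spectra, are assumptions made purely for illustration.

```python
import numpy as np


def compress_library(b, model_spectra):
    """Precompute <y> for every model: an O(MN) step per model, done once."""
    return model_spectra @ b.T                # shape (n_models, M)


def moped_loglikes(y_data, y_library):
    """Log of the compressed likelihood for every library model, O(M) each."""
    diff = y_library - y_data
    m = y_data.size
    return -0.5 * (np.sum(diff**2, axis=1) + m * np.log(2.0 * np.pi))


# Hypothetical toy setup: 4 library models, N = 6 data points, M = 2 vectors.
rng = np.random.default_rng(2)
b = rng.normal(size=(2, 6))                   # stand-in weighting vectors
model_spectra = rng.normal(size=(4, 6))       # stand-in theoretical spectra
y_library = compress_library(b, model_spectra)

x = model_spectra[2]                          # noise-free data drawn from model 2
y_data = b @ x                                # one compressed number per parameter
best = int(np.argmax(moped_loglikes(y_data, y_library)))
```

Only `y_library` needs to be kept once the vectors are fixed, so the search over models never touches the full $N$-point spectra.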

In addition to this, there is a one- off 0{MN^) operation to 
compute the MOPED vectors. The number of likelihood evaluations
required to find the maximum is not easy to compute a priori,
but is likely to depend exponentially on $M$, so for a large-dimensional
parameter space, the overhead in computing the vectors
is negligible in comparison with time spent in searching the
space.



3 RESULTS 

We simulate a CMB dataset by adding gaussian noise, at the level
of cosmic variance, to theoretical power spectra produced by CMBFAST.
The power spectrum is convolved with a gaussian of chosen
width, to mimic approximately the correlations in power spectrum
estimates introduced by partial sky coverage. The dataset consists
of the power spectrum sampled in even steps in $\ell$. The model chosen
has $H_0 = 65$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_\Lambda = 0.7$ and $\Omega_{\rm CDM} = 0.254$.
The unconvolved power spectrum is shown in fig. 2 and the convolved
spectrum in fig. 3.

Figure 3. The true model spectrum (solid), with $H_0 = 65$ km s$^{-1}$ Mpc$^{-1}$,
$\Omega_\Lambda = 0.7$ and $\Omega_{\rm CDM} = 0.254$, with gaussian noise and smoothed in $\ell$ with a
gaussian of width $\Delta\ell = 5$. Also shown (dotted) is the fiducial model used
in the data compression for fig. 5, with $H_0 = 69.8$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_\Lambda = 0.758$
and $\Omega_{\rm CDM} = 0.254$, both smoothed with a gaussian of width $\Delta\ell = 5$. The
boxes show the data points used for the likelihood calculations.

Figure 4. Likelihood surface for $\Omega_\Lambda$ and $H_0$ obtained from the full
dataset. This dataset consists of 150 power spectrum estimates from
$\ell = 2, \ldots, 1500$ in steps of 10, smoothed over a scale of $\Delta\ell = 5$. The true
model is labelled with a square. The likelihood contours are too small to
see individually for this experiment; the outer contour contains 99.99% of
the probability, assuming uniform priors.
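
The simulation just described can be sketched as follows in Python with NumPy. Several choices here are assumptions rather than details taken from the paper: the input spectrum `toy_cl` is a hypothetical smooth curve (not CMBFAST output), the quoted width is interpreted as the standard deviation of the smoothing kernel, and the function name `simulate_estimates` is invented for the example.

```python
import numpy as np


def simulate_estimates(cl, ells, width=5.0, step=10, seed=0):
    """Noisy, smoothed, sub-sampled power spectrum estimates and their covariance.

    Assumes gaussian cosmic-variance noise of variance 2 C_l^2 / (2l + 1) and a
    gaussian smoothing kernel whose standard deviation is `width`.
    """
    rng = np.random.default_rng(seed)
    sigma = cl * np.sqrt(2.0 / (2.0 * ells + 1.0))
    noisy = cl + rng.normal(0.0, sigma)
    kernel = np.exp(-0.5 * ((ells[:, None] - ells[None, :]) / width) ** 2)
    kernel /= kernel.sum(axis=1, keepdims=True)   # weights in each row sum to one
    rows = kernel[::step]                         # keep every `step`-th multipole
    data = rows @ noisy
    mean = rows @ cl                              # smoothed noise-free spectrum
    cov = (rows * sigma**2) @ rows.T              # rows @ diag(sigma^2) @ rows.T
    return ells[::step], data, mean, cov


ells = np.arange(2, 1501)
toy_cl = 5000.0 / (1.0 + (ells / 500.0) ** 2)     # hypothetical spectrum, not CMBFAST
ell_used, data, mean, cov = simulate_estimates(toy_cl, ells)
```

With multipoles 2 to 1500 and a step of 10 this yields 150 estimates, and the smoothing makes neighbouring estimates weakly correlated, so `cov` is no longer diagonal.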

We calculate the full (equation 3) and compressed (equation 8)
likelihoods, varying the calculation in the following ways:

• We mimic the effects of partial sky coverage by convolving 
the power spectrum with a gaussian window function of various 
widths; we present results for a width of $\Delta\ell = 5$.

• The size of the dataset $N$ is varied by changing the upper multipole
limit of the available data, or by omitting some $C_\ell$ values.

• We explore different fiducial models, to see if the method is 
sensitive to an accurate initial guess of the parameters. 

We fix most of the cosmological parameters. The values are not particularly
important, but are listed here: $\Omega_b = 0.05$; scalar spectral
index $n = 1$; no tensor modes; no massive neutrinos; 3 massless
neutrinos. The parameters we allow to vary are the vacuum energy
density parameter $\Omega_\Lambda$, the CDM density parameter $\Omega_{\rm CDM}$ and the
Hubble constant $H_0$, although we only display likelihood surfaces
in the $\Omega_\Lambda$-$H_0$ plane, with $\Omega_{\rm CDM}$ fixed.

Figure 4 shows the $H_0$-$\Omega_\Lambda$ likelihood surface using the
power spectrum of Figure 3 up to $\ell = 1500$ in steps of 10. The power
estimates were smoothed with a gaussian of width 5. The calculation
of this grid of likelihoods took 9463 seconds of CPU on an
alpha workstation. Figure 5 shows the likelihood using 3 MOPED
components as compressed data. An incorrect fiducial model ($H_0 = 69.8$
km s$^{-1}$ Mpc$^{-1}$, $\Omega_\Lambda = 0.758$, $\Omega_{\rm CDM} = 0.254$) was chosen, to
illustrate that its choice is not important. The true solution is still
recovered accurately, but much faster: 0.00098 seconds, or an improvement
of order $10^7$.

In order to check that the compressed data recover the parameters
as accurately as the full data, we degrade the experiment, truncating
the data to $\ell = 2, \ldots, 300$, in steps of 10 (figs 6 and 7). The
method is designed to ensure that the error bars should be almost
the same as those from the full likelihood on average, and we see that for this
realisation the errors are indeed comparable. The full likelihood
calculation takes 1406 seconds, while MOPED takes 0.00016 seconds.
We see here that with a very poor fiducial model, MOPED
still correctly finds the solution, within the errors, but there is a
suggestion that the errors are only approximately correct. This can
arise because the $y_i$ are assumed to be uncorrelated, and this is only
strictly true if the fiducial model is correct, and even then it is only
the ensemble average errors which are unchanged. In practice, this
is not a problem. Firstly, we now have a much better idea of the shape
of the power spectrum, so can choose a fiducial model which is far
better than this one. Secondly, one can iterate, at very modest extra
computational expense, computing new MOPED vectors from the
best previous estimate.

Figure 5. Likelihood surface for $\Omega_\Lambda$ and $H_0$ obtained from the 3
MOPED components. The fiducial model used for the data compression
no longer coincides with the true model, and is marked by a triangle. Note
that the method still recovers the correct model (square).

Figure 6. Likelihood from the full power spectrum, as in fig. 4 but restricted
to $\ell < 300$ in steps of 10, to illustrate the size of the error bars.
The contours represent confidence limits of 99.99%, 99%, 95.4%, 90% and
68%. The true model is labelled with a square. The likelihood was calculated
on a grid covering $60 < H_0 < 70$ km s$^{-1}$ Mpc$^{-1}$ in steps of 0.2,
and $0.66 < \Omega_\Lambda < 0.76$ in steps of 0.002.

Figure 7. As fig. 6 but showing the likelihood from MOPED components.
Note that the error bars are comparable.
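
The iteration over fiducial models mentioned above can be illustrated with a toy example in Python with NumPy. Everything specific here is an assumption made for the sketch: `toy_cl` is a hypothetical two-parameter power law standing in for a real Boltzmann code, the covariance is pure diagonal cosmic variance, the data are noise-free, and the maximum is found on a coarse grid. Derivatives are taken by finite differences, as in Fig. 1.

```python
import numpy as np


def toy_cl(theta, ells):
    """Hypothetical two-parameter power-law spectrum (a stand-in for CMBFAST)."""
    amp, tilt = theta
    return amp * (ells / 100.0) ** tilt


def moped_vectors(dmean, cov):
    """MOPED weighting vectors (rows) from mean derivatives and covariance."""
    cinv_d = np.linalg.solve(cov, dmean.T).T
    b = np.zeros(dmean.shape)
    for a in range(dmean.shape[0]):
        overlap = b[:a] @ dmean[a]
        numer = cinv_d[a] - overlap @ b[:a]
        b[a] = numer / np.sqrt(dmean[a] @ cinv_d[a] - overlap @ overlap)
    return b


def vectors_at(fiducial, ells, h=1e-4):
    """MOPED vectors from central finite differences at a fiducial model."""
    fiducial = np.asarray(fiducial, dtype=float)
    dmean = np.array([
        (toy_cl(fiducial + h * e, ells) - toy_cl(fiducial - h * e, ells)) / (2 * h)
        for e in np.eye(fiducial.size)
    ])
    cov = np.diag(2.0 * toy_cl(fiducial, ells) ** 2 / (2.0 * ells + 1.0))
    return moped_vectors(dmean, cov)


def best_on_grid(b, data, grid, ells):
    """Grid point maximising the compressed likelihood (minimum chi-squared)."""
    y = b @ data
    chi2 = [np.sum((y - b @ toy_cl(theta, ells)) ** 2) for theta in grid]
    return grid[int(np.argmin(chi2))]


ells = np.arange(2.0, 301.0, 10.0)
amps = np.linspace(0.8, 1.2, 41)
tilts = np.linspace(-1.2, -0.8, 41)
grid = [(a, t) for a in amps for t in tilts]
truth = (amps[20], tilts[20])
data = toy_cl(truth, ells)                 # noise-free, so the peak sits at the truth

estimate = (0.9, -1.1)                     # deliberately poor starting fiducial model
for _ in range(2):                         # recompute the vectors from the last estimate
    estimate = best_on_grid(vectors_at(estimate, ells), data, grid, ells)
```

Each pass costs one recomputation of the vectors plus a cheap search, which is the sense in which the extra expense of iterating is modest.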



4 CONCLUSIONS 

The steps required to turn a set of power spectrum measurements
$C_\ell$ into estimates of cosmological parameters consist of

• Computation of theoretical $C_\ell$

• Calculation of likelihood of model parameters

• Maximisation of likelihood and marginalisation



Tegmark, Zaldarriaga & Hamilton (2001) have addressed the speed
of the first step, accelerating CMBFAST (Seljak & Zaldarriaga
1996) by a factor $\sim 10^3$. This paper complements that analysis
by speeding up the brute-force likelihood evaluation in the second
step by even larger factors. For $N$ correlated data points, a
brute-force likelihood evaluation using all the data scales as $N^3$.
MOPED reduces this to $M$ approximately uncorrelated, unit variance
components, whose likelihood evaluation scales with the number
of parameters $M$. For a Planck-size dataset with $N \sim 2000$
and $M \sim 12$ parameters, the speed-up factor should be around 500
million. In a sense MOPED is much more powerful than it needs to
be, but this is hardly a criticism. With MOPED and the advances of 
Tegmark, Zaldarriaga & Hamilton (2001), parameter estimation is 
accelerated by a useful factor of $\gtrsim 10^3$, and we can be fairly certain
that the data processing element will be dominated by other steps 
in the analysis pipeline. 
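
The speed-up figures quoted in this paper can be checked with simple arithmetic. The short Python sketch below compares the naive scaling ratio $N^3/M$ for the Planck-size example with the measured CPU-time ratios reported in Section 3; prefactors are ignored in the scaling estimate, so agreement is expected only to within a factor of order unity. The variable names are chosen for this example.

```python
n_data, n_params = 2000, 12            # Planck-size example quoted in the text

# Ratio of the O(N^3) brute-force cost to the O(M) MOPED cost, prefactors ignored.
scaling_speedup = n_data**3 / n_params

# Measured CPU-time ratios from the two simulated experiments in Section 3.
measured_l1500 = 9463 / 0.00098        # data up to multipole 1500
measured_l300 = 1406 / 0.00016         # data truncated at multipole 300

print(f"{scaling_speedup:.2e} {measured_l1500:.2e} {measured_l300:.2e}")
```

The scaling estimate comes out at about $6.7 \times 10^8$, the same order as the quoted 500 million, while both measured ratios are close to $10^7$.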

The speed of MOPED may influence the analysis strategy; 
if the likelihood evaluation is slow in comparison with theoretical 
power spectrum generation, then one can compute the power spec- 
tra 'on-the-fly' in a search for the maximum likelihood. Given that 
the position is now reversed, there is a case for creating grids of 
theoretical models in the years before launch of Planck. If storage 
space becomes a limiting issue, one can store the expected MOPED 
components for each model, rather than the full $C_\ell$, with a compression
factor > 100. However, there may well still be a case for less
rigid searches of parameter space, such as Markov Chain Monte 
Carlo methods (Christensen et al. 2001), since they can simultaneously
estimate the shape of the likelihood surface around the peak,
as well as finding the peak itself. MOPED can be combined with 
such methods to advantage. Finally we note that, for current exper- 
iments, data compression is not necessary, as there are relatively 
few band-power estimates available. 
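
As a sketch of how the compressed likelihood might be combined with a Markov Chain Monte Carlo search, the following Python example runs a basic random-walk Metropolis sampler on a toy problem. It is illustrative only: the sampler is a generic textbook algorithm rather than the method of Christensen et al. (2001), and the linear model `response @ theta` for the expected MOPED components is a hypothetical stand-in for values that would in practice come from theoretical power spectra.

```python
import numpy as np


def metropolis(loglike, start, step, n_steps, seed=0):
    """Random-walk Metropolis sampler with a symmetric gaussian proposal."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(start, dtype=float)
    current = loglike(theta)
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        trial = theta + step * rng.normal(size=theta.size)
        trial_ll = loglike(trial)
        if np.log(rng.uniform()) < trial_ll - current:   # accept with prob min(1, ratio)
            theta, current = trial, trial_ll
        chain[i] = theta
    return chain


# Hypothetical linear toy model for the compressed means: <y>(theta) = response @ theta.
response = np.array([[10.0, 0.0], [3.0, 8.0]])
truth = np.array([1.0, 2.0])
y_data = response @ truth                    # noise-free compressed data


def moped_loglike(theta):
    """Compressed log-likelihood up to a constant: unit-variance, uncorrelated y."""
    diff = y_data - response @ theta
    return -0.5 * diff @ diff


chain = metropolis(moped_loglike, start=truth, step=0.1, n_steps=20000)
posterior_mean = chain[2000:].mean(axis=0)   # discard an initial stretch of the chain
posterior_std = chain[2000:].std(axis=0)
```

Because each likelihood call involves only $M$ numbers, the cost of a long chain is set almost entirely by how quickly the expected components can be supplied for each trial model.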